# Linear Factor Pricing Models

## Notation

| Notation | Description |
|---------------------------------------------------------------------|-------------------------------------------------------------|
| $\tilde{r}$ | excess return rate over the period |
| $\tilde{r}^{\scriptscriptstyle {i}}$ | arbitrary asset $i$ |
| $\tilde{r}^{\scriptscriptstyle {p}}$ | arbitrary portfolio $p$ |
| $\tilde{r}^{\scriptscriptstyle {\texttt{t}}}$ | tangency portfolio |
| $\tilde{r}^{\scriptscriptstyle {m}}$ | market portfolio |
| $\tilde{r}^{\scriptscriptstyle {s}}$ | size portfolio |
| $\tilde{r}^{\scriptscriptstyle {v}}$ | value portfolio |
| $\beta^{\scriptscriptstyle {i,j}}$ | regression beta of $\tilde{r}^{\scriptscriptstyle {i}}$ on $\tilde{r}^{\scriptscriptstyle {j}}$ |

# Fama-French

## Fama-French model

The **Fama-French 3-factor model** is one of the most well-known multifactor models.
$$ 
\mathbb{E}\left[\tilde{r}^{\scriptscriptstyle {i}}\right] = \beta^{i,m}\; \mathbb{E}\left[\tilde{r}^{\scriptscriptstyle {m}}\right] + \beta^{i,s}\; \mathbb{E}\left[\tilde{r}^{\scriptscriptstyle {s}} \right] + \beta^{i,v} \; \mathbb{E}\left[\tilde{r}^{\scriptscriptstyle {v}}\right]
$$

* $\tilde{r}^{\scriptscriptstyle {m}}$ is the excess market return as in the CAPM.
* $\tilde{r}^{\scriptscriptstyle {s}}$ is a portfolio that goes long small stocks and shorts large stocks.
* $\tilde{r}^{\scriptscriptstyle {v}}$ is a portfolio that goes long value stocks and shorts growth stocks.
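As an illustration of how the three betas are obtained, here is a minimal sketch that runs the time-series regression of one asset on the three factors. The data are simulated, and the factor means, volatilities, and `true_betas` are made-up assumptions rather than estimates from any real dataset.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 600  # months of simulated data (hypothetical)

# Simulated excess-return factors: market, size, value (one column each). Not real data.
factors = rng.normal(loc=[0.006, 0.002, 0.003], scale=[0.045, 0.03, 0.03], size=(T, 3))

true_betas = np.array([1.1, 0.4, -0.2])                 # assumed loadings for asset i
asset = factors @ true_betas + rng.normal(0, 0.02, T)   # zero alpha by construction

# Time-series regression of the asset on a constant and the three factors.
X = np.column_stack([np.ones(T), factors])
coef, *_ = np.linalg.lstsq(X, asset, rcond=None)
alpha_hat, betas_hat = coef[0], coef[1:]

# Model-implied risk premium: estimated betas times the sample factor premia.
implied_premium = betas_hat @ factors.mean(axis=0)
```

With an intercept in the regression, the sample mean of the asset equals `alpha_hat + implied_premium` exactly, so `alpha_hat` is the part of the average return the factors do not explain.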

### Use of growth and value

The labels "growth" and "value" are widely used.

- Historically, value stocks have delivered higher average returns.
- So-called "value" investors try to take advantage of this by looking for stocks with a low market price relative to fundamentals or cash flow.
- Much research has been done to try to explain this difference of returns and whether it is reflective of risk.
- Many funds (ETF, mutual funds, hedge funds) orient themselves around being "value" or "growth".

### FF Measure of Value

The **book-to-market** (B/M) ratio is the book (balance sheet) value of equity divided by the market value of equity.

* High B/M means strong (accounting) fundamentals per market-value-dollar.
* High B/M are **value** stocks.
* Low B/M are **growth** stocks.

This is the most common measure used to build the value factor portfolio.

### Other value measures

There are many other measures of value, each based on some cash-flow or accounting value per unit of market price.

* **Earnings-price** is a popular metric beyond value portfolios. Like B/M, the E/P ratio is accounting value per market valuation.
* **EBITDA-price** is similar, but uses accounting measure of profit that ignores taxes, financing, and depreciation.
* **Dividend-price** uses common dividends, but less useful for individual firms as many have no dividends.

Many other measures, and many competing claims to special/better measure of 'value'.

### Other Popular Factors

Sort portfolios of equities based on...

* Price movement. Momentum, mean reversion, etc.
* Volatility. Realized return volatility, market beta, etc.
* Profitability.*
* Investment.*

\*As measured in financial statements.

### Characteristics or Betas?

LFPM says security's **beta** matters, not its measure of the **characteristic**.

* So what does FF model expect of a stock with high B/M yet low correlation to other high B/M stocks?
* Beta earns premium---not the stock's characteristic.
* This is one difference between FF "value" investing and Buffett-Graham "value" investing.

### Testing the model

Testing these LFMs is analogous to testing the CAPM.

* Time-series test.
* Cross-sectional test.
* Statistical significance through chi-squared test of alphas. (ie Do the factors span the MV frontier?)

### Finding the right factors

Hundreds of tests and papers written about LFMs! Does $z^j$ help the model given the other $\boldsymbol{z}$?

* Really asking whether $z^j$ adds to the MV frontier generated by $\boldsymbol{z}$.
* Calculate factor MV:
$$ \boldsymbol{w} = \boldsymbol{\Sigma}_{\boldsymbol{z}}^{-1}\boldsymbol{\lambda}_{\boldsymbol{z}} \frac{1}{\gamma} $$
* Any significant weight on factor $z^j$?
* Easy to formally test this using t-stat, chi-squared test, etc.
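The weight calculation above can be sketched as follows. The function name `factor_mv_weights`, the simulated factors, and the candidate factor `z_new` are hypothetical; the candidate is built as a noisy copy of the first factor with no premium of its own, so its population weight is zero.

```python
import numpy as np

def factor_mv_weights(factor_returns, gamma=1.0):
    """Mean-variance weights w = Sigma^{-1} lambda / gamma on excess-return factors."""
    lam = factor_returns.mean(axis=0)               # sample factor premia
    sigma = np.cov(factor_returns, rowvar=False)    # sample factor covariance
    return np.linalg.solve(sigma, lam) / gamma

rng = np.random.default_rng(1)
T = 5000
z = rng.normal(loc=[0.005, 0.003], scale=[0.04, 0.03], size=(T, 2))  # simulated factors
# Hypothetical candidate factor: first factor plus noise, with no extra premium.
z_new = z[:, 0] + rng.normal(0, 0.02, T)

all_factors = np.column_stack([z, z_new])
w = factor_mv_weights(all_factors)   # last entry is the weight on the candidate factor
```

In a finite sample the weight on `z_new` is only approximately zero, which is why the formal t-stat or chi-squared test is needed rather than eyeballing the weight.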

# Momentum

## Return autoregressions: momentum or reversion?

With the overall market index, there is no clear evidence of momentum or mean-reversion.
$$ r^{\scriptscriptstyle {m}}_{t+1} = \alpha + \beta r^{\scriptscriptstyle {m}}_t + \epsilon_{t+1} $$

The autoregression does not find $\beta$ to be significant, either statistically or economically.
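A minimal sketch of estimating this autoregression by OLS is below. The series is simulated with assumed parameters (`mu`, `beta_true`, and the shock volatility are illustrative choices, not market estimates), and `ar1_fit` is a hypothetical helper name.

```python
import numpy as np

def ar1_fit(r):
    """OLS estimates (alpha, beta) of r[t+1] = alpha + beta * r[t] + eps."""
    x, y = r[:-1], r[1:]
    beta = np.cov(x, y)[0, 1] / np.var(x, ddof=1)
    alpha = y.mean() - beta * x.mean()
    return alpha, beta

rng = np.random.default_rng(2)
mu, beta_true, T = 0.0083, 0.05, 20000   # assumed parameters for the simulation
r = np.empty(T)
r[0] = mu
for t in range(T - 1):
    r[t + 1] = mu + beta_true * (r[t] - mu) + rng.normal(0, 0.04)

alpha_hat, beta_hat = ar1_fit(r)
mu_implied = alpha_hat / (1 - beta_hat)  # unconditional mean implied by the estimates
```

Even with a true coefficient of 0.05, a long sample is needed before the estimate is distinguishable from zero, which is consistent with the weak evidence in index data.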

> #### Footnote
> Of course, we can write this regression as
> $$\left(r^{\scriptscriptstyle {m}}_{t+1} - \mu\right) = \beta \left(r^{\scriptscriptstyle {m}}_t - \mu\right) + \epsilon_{t+1}$$
> where $\mu$ is the mean of $r^{\scriptscriptstyle {m}}$, and $\alpha = (1-\beta)\mu$.

## Autocorrelation of individual stocks

What about individual stocks? Is there significant autocorrelation in their returns?

* At a monthly level, most equities would have no higher than $\beta = 0.05$.
* Thus, for a long time the issue was ignored: the effect seemed too small to be economically meaningful, especially with trading costs!

## Trading on small autocorrelation

Two keys to taking advantage of this small autocorrelation: 

1. Trade the **extreme “winners” and “losers”**
 * Small autocorrelation multiplied by large returns gives sizeable return in the following period. 
 * By additionally shorting the biggest “losers”, we can magnify this further. 
2. Hold a **portfolio of many** “winners” and “losers.”
 * Holding a portfolio of such stocks diversifies the idiosyncratic risk. 
 * Very small $R^2$ stat for any individual autoregression, but can play the odds (ie. rely on the small $R^2$) across 1000 stocks all at the same time.

## Illustration: Workings of momentum

* Assume each stock $i$ has returns which evolve over time as
$$ \left(r_{t+1}^i - \underbrace{0.83\%}_{mean}\right) = \underbrace{0.05}_{autocorr}\left(r^{\scriptscriptstyle {i}}_t - \underbrace{0.83\%}_{mean}\right) + \epsilon_{t+1} $$
* Assume there is a continuum of stocks, and their cross-section of returns at any point in time, $t$, is normally distributed with mean $\mu = 0.83\%$ and standard deviation $\sigma = 11.5\%$:
$$ r^{\scriptscriptstyle {i}}_t \sim \mathcal{N}\left(0.83\%,11.5\%\right) $$

## Illustration: normality

From the normal distribution assumption,

* The top 10% of stocks in any given period are those with returns more than 1.28$\sigma$ above the mean, $\mu$.
* Thus, the mean return of these “winners” is found by calculating the conditional mean:
$$ \mathbb{E}\left[r\ |\ r > \mu + 1.28\sigma\right] = \frac{\int_{\mu + 1.2816\sigma}^\infty r \phi(r)dr}{\int_{\mu + 1.2816\sigma}^\infty \phi(r)dr} $$
where $\phi(r)$ is the pdf of the normal distribution listed above.
* For a normal distribution, we have a closed-form solution for this conditional expectation (the mean of a truncated normal). With $\mu = 0.83\%$ and $\sigma = 11.5\%$,
$$ \mathbb{E}\left[r\ |\ r > \mu + 1.28\sigma\right] = \mu + 1.755\sigma = 21.01\%. $$
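The truncated-normal mean can be checked numerically with the standard library. This is a small sketch using the illustration's parameters (mean 0.83% and standard deviation 11.5% per month); the variable names are my own.

```python
from statistics import NormalDist

mu, sigma = 0.83, 11.5            # percent per month, from the illustration
std = NormalDist()                # standard normal

z = std.inv_cdf(0.90)             # cutoff for the top 10% of stocks, in standard deviations
# Mean of a standard normal truncated below at z: pdf(z) / P(Z > z).
mills = std.pdf(z) / (1 - std.cdf(z))

winners_mean = mu + sigma * mills  # conditional mean return of the winners, in percent
print(round(z, 4), round(mills, 3), round(winners_mean, 2))
```

The ratio `mills` is the number of standard deviations by which the winners' average return exceeds the cross-sectional mean.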

## Illustration: autocorrelation

From the autocorrelation assumption:

* A portfolio of time $t$ winners, $r^{\scriptscriptstyle {u}}$, is expected to have a time $t+1$ mean return of
$$ \mathbb{E}_t\left[r^{\scriptscriptstyle {u}}_{t+1}\right] = 0.83\% + .05\left(21.01\% - 0.83\%\right) = 1.84\% $$
* We assumed that the average return across stocks is 0.83%.
* Thus, the momentum of the winners yields an additional 1% per month.
* Going long the winners as well as short the losers doubles this excess return.
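The arithmetic in these bullets can be reproduced in a few lines. This sketch takes the 21.01% winners' mean from the normality illustration as given and assumes the losers are the mirror image of the winners, which follows from the symmetry of the normal distribution; the function `expected_next` is a hypothetical helper.

```python
mu = 0.83             # cross-sectional mean return, percent per month
rho = 0.05            # assumed autocorrelation
winners_now = 21.01   # mean return of this period's top decile, percent
losers_now = 2 * mu - winners_now   # bottom decile, by symmetry of the normal

def expected_next(r_now):
    """Expected next-period return given this period's return."""
    return mu + rho * (r_now - mu)

winners_next = expected_next(winners_now)
losers_next = expected_next(losers_now)

long_only_excess = winners_next - mu      # winners versus the average stock
long_short = winners_next - losers_next   # long winners, short losers
```

Here `long_short` is exactly twice `long_only_excess`, which is the doubling described in the last bullet.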

## Implementing a momentum strategy over time

A **momentum** strategy with equities is formed by ranking securities on recent realized return. 

* Go long the portfolio of the recent period's biggest winners and go short the recent period's biggest losers. 
* After holding the “momentum” portfolio for some time period, re-rank the “winners” and “losers”. 
* Re-sorting frequently is important as the securities move frequently in and out of “winner/loser” rankings.
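The rank-and-rebalance loop described above might look like the following sketch. The decile cutoff, the 12-month lookback, the one-month holding period, and the function name `momentum_weights` are illustrative assumptions; the returns are simulated i.i.d. draws, so this example shows the mechanics only and should not be expected to earn a momentum premium.

```python
import numpy as np

def momentum_weights(past_returns, quantile=0.10):
    """Equal-weight long the top `quantile` of past returns, short the bottom `quantile`."""
    n = len(past_returns)
    k = max(1, int(round(n * quantile)))
    order = np.argsort(past_returns)   # ascending: losers first, winners last
    w = np.zeros(n)
    w[order[-k:]] = 1.0 / k            # long the winners
    w[order[:k]] = -1.0 / k            # short the losers
    return w

rng = np.random.default_rng(3)
n_stocks, n_months, lookback = 200, 36, 12
returns = rng.normal(0.0083, 0.115, size=(n_months, n_stocks))  # simulated, not real data

strategy = []
for t in range(lookback, n_months):
    # Re-rank each month on cumulative return over the trailing window, hold one month.
    past = (1 + returns[t - lookback:t]).prod(axis=0) - 1
    w = momentum_weights(past)
    strategy.append(w @ returns[t])
strategy = np.array(strategy)
```

Rebalancing less often than monthly would amount to reusing the same `w` for several periods, trading lower turnover against staler rankings.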

## Updating the rankings

table here

* 5 of the 17 stocks which moved in and out of “winners” of the Russell 1000. (ie. Joined or dropped from top-10% of the index.) 
* Ranked by cumulative one-year return from Oct. 2013 - Sep. 2014, and then re-ranked one month later based on cumulative return from Nov. 2013 - Oct 2014.

## Trading costs versus momentum returns

The re-sorting frequency must balance two objectives:

* Minimizing trading costs. 
* Updating portfolio to hold highest-momentum assets.

For US equities, excess returns are up to 0.67% per month, before trading costs.

## Trading costs

Often claimed that momentum does not survive net of trading costs.

**Transaction costs.** 
* Transaction costs would be overwhelming for a retail investor.
* But institutional investors have much smaller costs. 
* Can delay or adjust portfolio rebalancing to lessen turnover.

**Tax burden.**
* Lots of trading may induce large capital gains taxes. 
* But the strategy sells losers (reaping capital losses) and holds winners (delaying capital gains).
* Also, momentum stocks tend to have relatively low dividend yields, avoiding inefficient dividend taxation.

## Widespread momentum

Momentum strategies in many asset classes deliver excess returns. 

* International equities and equity indices
* Government bonds
* Currencies
* Commodities
* Futures

## Evidence: Momentum returns

|                 | Excess return | CAPM alpha | Sharpe ratio |
|-----------------|:-------------:|:----------:|:------------:|
| U.S. stocks     |    5.8%       |   7.2%     |    0.86      |
| Global stocks   |    5.3%       |   5.8%     |    1.21      |
| Currencies      |    5.6%       |   5.7%     |    0.69      |
| Commodities     |   17.1%       |  17.1%     |    0.77      |

_Table: Excess returns to momentum strategies_

* Source: Asness et al. (2013), Table 1. 
* Annualized estimates. Monthly data, 1972-2011.
* See paper for t-stats.

## Risk-based explanations

Is momentum strategy associated with some risk?

* Volatility? 
* Correlation to market index, such as the S&P?
* Business-cycle correlation?
* Tail risk?
* Portfolio rebalancing risk?

## Behavioral explanations

Can investor behavior explain momentum?

**Under-reaction** to news.
 * At time $t$, positive news about stock pushes price up 5%. 
 * At time $t+1$, investors fully absorb the news and stock goes up another 1% to rational equilibrium price. 

**Over-reaction** to news.
 * At time $t$, positive news about stock pushes price up 5%---to rational equilibrium. 
 * At time $t+1$, investors are overly optimistic about the news and recent return. Stock goes up another 1%.

## Explaining momentum

Years of debate regarding the explanation for momentum.

* Any evidence for the rational explanation? Can we specify the risk that makes investors reluctant to engage in momentum strategies? 
* Suppose we believe the cause is behavioral. How can we distinguish between the two (opposite!) behavioral theories on the previous slide?

## Momentum in practice

Momentum is one of the most popular strategies used by managed funds. 

* The lack of a perfect explanation of momentum has not kept funds from using it! 
* It is popular not just for the large excess returns but also for its potential help in diversification, given its low correlation with other popular strategies (such as value investing).
* Even accessible to retail investors through mutual-fund-type products.

# APT

## The APT

**Arbitrage pricing theory (APT)** gives conditions for when a Linear Factor Decomposition of return **variation** implies a Linear Factor Pricing for **risk premia**. 

* The assumptions needed will not hold exactly. 
* Still, it is commonly used as a way to build LFP for risk premia in industry.

## APT factor structure

Suppose we have some excess-return factors, $\textbf{x}$, which work well as a LFD.
$$ \tilde{r}^{\scriptscriptstyle {i}}_t = \alpha^i + \left(\boldsymbol{\beta}^{\scriptscriptstyle {i,\textbf{x}}}\right)'\textbf{x}_t + \epsilon^i_t $$ 

**APT Assumption:** The residuals are uncorrelated across regressions
$$ \text{corr}\left[\epsilon^i,\epsilon^j\right] = 0, \hspace{.2cm} i\ne j $$
That is, the factors completely describe return comovement.

## A Diversified Portfolio

Take an equally weighted portfolio of the $n$ returns
$$ 
\tilde{r}^{\scriptscriptstyle {p}}_t = \frac{1}{n}\sum_{i=1}^n \tilde{r}^{\scriptscriptstyle {i}}_t \\
= \alpha^p + \left(\beta^{\scriptscriptstyle {p,\textbf{x}}}\right)'\textbf{x}_t + \epsilon^p_t 
$$
where
$$ \alpha^p = \frac{1}{n}\sum_{i=1}^n \alpha^i, \hspace{.5cm} \beta^{\scriptscriptstyle {p,\textbf{x}}} = \frac{1}{n}\sum_{i=1}^n \boldsymbol{\beta}^{\scriptscriptstyle {i,\textbf{x}}}, \hspace{.5cm} \epsilon^p_t = \frac{1}{n}\sum_{i=1}^n \epsilon^i_t $$

## Idiosyncratic variance

The idiosyncratic risk of $\tilde{r}^{\scriptscriptstyle {p}}_t$ depends only on the residual variances. 

* By construction, the residuals are uncorrelated with the factors, $\textbf{x}$.
* By assumption, the residuals are uncorrelated with each other. 

$$ \text{var}\left[\epsilon^p\right] = \frac{1}{n}\overline{\sigma_\epsilon}^2 $$
where $\overline{\sigma_\epsilon}^2$ is the average residual variance of the $n$ assets.
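The variance formula can be verified directly. In this sketch the residual variances are hypothetical random draws, and the residual covariance matrix is diagonal because the residuals are assumed to be uncorrelated across assets.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 500
resid_var = rng.uniform(0.01, 0.09, size=n)   # hypothetical residual variances

# Uncorrelated residuals imply a diagonal residual covariance matrix.
omega = np.diag(resid_var)
w = np.full(n, 1.0 / n)                       # equal weights

var_portfolio = w @ omega @ w                 # variance of the portfolio residual
var_formula = resid_var.mean() / n            # average residual variance divided by n
```

The two numbers agree, and both shrink toward zero at rate $1/n$ as more assets are added.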

## Perfect factor structure

As the number of diversifying assets, $n$, grows
$$ \lim_{n\to\infty} \text{var}\left[\epsilon^p\right] = 0 $$

Thus, in the limit, $\tilde{r}^{\scriptscriptstyle {p}}$ has a perfect factor structure, with no idiosyncratic risk:
$$ \tilde{r}^{\scriptscriptstyle {p}}_t= \alpha^p + \left(\beta^{\scriptscriptstyle {p,\textbf{x}}}\right)'\textbf{x}_t $$

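This says that $\tilde{r}^{\scriptscriptstyle {p}}$ can be perfectly replicated with the factors $\textbf{x}$. Going long $\tilde{r}^{\scriptscriptstyle {p}}$ and short the replicating position, $\left(\beta^{\scriptscriptstyle {p,\textbf{x}}}\right)'\textbf{x}_t$, leaves a riskless residual position of $\alpha^p$.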

## Obtaining the LFP in x

**APT Assumption 2:** There is no arbitrage. 

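Given that $\tilde{r}^{\scriptscriptstyle {p}}$ is perfectly replicated by the return factors, $\textbf{x}$, any nonzero $\alpha^p$ would be a riskless excess return earned at zero cost. No arbitrage rules this out, so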
$$ \alpha^p = 0 $$
Thus, taking expectations of both sides, we have a LFP:
$$ \mathbb{E}\left[\tilde{r}^{\scriptscriptstyle {p}}\right] = \left(\beta^{\scriptscriptstyle {p,\textbf{x}}}\right)' \boldsymbol{\lambda}^x $$
where
$$ \boldsymbol{\lambda}^x = \mathbb{E}\left[\textbf{x}\right] $$

## Explaining variation and pricing

The APT comes to a stark conclusion:

* Assume we find a Linear Factor Decomposition (LFD) that works so well it leaves no correlation in the residuals. 
* That is, the set of factors explains **realized** returns across **time**. (Covariation)
* The APT concludes the factors must also describe **expected** returns across **assets**. (Risk premia)

That is, a perfect LFD will also be a perfect LFP!

# Economic Factors (CCAPM)

## Non-return factors

What if we want to use a vector of factors, $\boldsymbol{z}$, which are not themselves assets? 

* Examples include slope of the term structure of interest rates, liquidity measures, economic indicators, etc.
* The time-series tests of LFM relied on,
$$ \boldsymbol{\lambda}_{\boldsymbol{z}} = \mathbb{E}\left[\boldsymbol{\tilde{r}^{\scriptscriptstyle {\boldsymbol{z}}}}\right], \hspace{1cm} \boldsymbol{\alpha} = \textbf{0} $$
But with untraded factors, $\boldsymbol{z}$, we do not have either implication. 
* Thus to test an LFM with untraded factors, we must do the cross-sectional test.

## The CCAPM

The **Consumption CAPM** (CCAPM) says that the only systematic risk is consumption growth. 
$$ \mathbb{E}\left[\tilde{r}^{\scriptscriptstyle {i}}\right]= \beta^{\scriptscriptstyle {i,c}}\, \lambda_c $$
where $c$ is some measure of consumption growth. 

* The challenge is specifying a good measure for $c$.
* The CAPM can be seen as a special case where $c = \tilde{r}^{\scriptscriptstyle {m}}$. 
* Generally, a measure of $c$ is a non-traded factor.
* We could build a replicating portfolio, or test it directly in the cross-section.

## Testing the CCAPM across assets

1. Run the time-series regression for each test-security, $i$. 
$$ \tilde{r}^{\scriptscriptstyle {i}}_t = a^i + \beta^{\scriptscriptstyle {i,c}} c_t + \epsilon^{i}_t $$
The intercept is denoted $a$ to emphasize it is not an estimate of model error, $\alpha$. 
2. Run the single cross-sectional regression to estimate the premium, $\lambda_c$ and the residual pricing errors, $\alpha^i$.
$$ \mathbb{E}\left[\tilde{r}^{\scriptscriptstyle {i}}\right] =\,\lambda_c\, \beta^{\scriptscriptstyle {i,c}} +\; \alpha^i $$ 
As usual, the theory implies the cross-sectional regression should not have an intercept, but it is often included.
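The two steps can be sketched as follows on simulated data. The premium `lam_c`, the factor process, and the spread of betas are assumptions chosen so that the model holds by construction; the cross-sectional regression is run without an intercept, as the theory implies.

```python
import numpy as np

rng = np.random.default_rng(5)
T, n = 2000, 25
lam_c = 0.004                          # assumed premium, used only to simulate data
c = rng.normal(0.005, 0.01, T)         # simulated non-traded factor
beta_true = np.linspace(0.2, 2.0, n)

# Simulated excess returns whose means equal beta * lam_c.
shocks = rng.normal(0, 0.02, size=(T, n))
r = beta_true * lam_c + np.outer(c - c.mean(), beta_true) + shocks

# Step 1: time-series regression of each asset on a constant and c.
X = np.column_stack([np.ones(T), c])
coef, *_ = np.linalg.lstsq(X, r, rcond=None)
a_hat, beta_hat = coef[0], coef[1]     # a_hat is an intercept, not a pricing error

# Step 2: one cross-sectional regression of mean returns on estimated betas.
mean_r = r.mean(axis=0)
lam_hat = (beta_hat @ mean_r) / (beta_hat @ beta_hat)
alpha_hat = mean_r - lam_hat * beta_hat   # residual pricing errors
```

Because the betas in step 2 are themselves estimates, standard errors for `lam_hat` need a correction that this sketch leaves out.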

## Evidence for CCAPM: consumption beta and returns

figure here

## Model with alternate consumption measurement

figure here

## Macro factors

A number of industry models use non-traded, macro factors. 

* GDP growth
* Recession indicator
* Monetary policy indicators
* Market volatility

Consumption factors are widely studied in academia, but less in industry.

## Factor-mimicking returns

**Factor-mimicking returns** are the linear projection of non-return factors onto the space of traded returns, $\boldsymbol{r}$:
$$ \boldsymbol{\tilde{r}^{\scriptscriptstyle {\boldsymbol{z}}}}= \mathbb{L}\left(\boldsymbol{z}~ |\ \boldsymbol{r}\right) $$
Recall that a linear projection can be calculated simply by regressing $\boldsymbol{z}$ on the available security returns, $\boldsymbol{r}$. 

* If there is a LFM in $\boldsymbol{z}$, then there is also a LFM in the factor-mimicking portfolios, $\boldsymbol{\tilde{r}^{\scriptscriptstyle {\boldsymbol{z}}}}$. 
* Then we are back to having an LFM in tradable factors, $\boldsymbol{\tilde{r}^{\scriptscriptstyle {\boldsymbol{z}}}}$.
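A minimal sketch of building a factor-mimicking return by regression is below. The returns and the non-traded factor `z` are simulated, the loadings used to generate `z` are arbitrary, and including a constant in the projection is one common choice rather than the only one.

```python
import numpy as np

rng = np.random.default_rng(6)
T, n = 1000, 5
r = rng.normal(0.005, 0.04, size=(T, n))   # simulated excess returns (not real data)
# Hypothetical non-traded factor: partly spanned by returns, partly pure noise.
z = r @ np.array([0.5, -0.2, 0.0, 0.3, 0.1]) + rng.normal(0, 0.02, T)

# Project z on a constant and the returns; the slopes are the mimicking-portfolio weights.
X = np.column_stack([np.ones(T), r])
coef, *_ = np.linalg.lstsq(X, z, rcond=None)
weights = coef[1:]

r_z = r @ weights                          # factor-mimicking excess return
projection_error = z - coef[0] - r_z       # part of z that no traded return can track
```

The projection error is orthogonal to every return series in the sample, so `r_z` carries all of the covariance between `z` and the traded assets.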

# Appendix: PCA

## Principal components

The **principal components** of returns are statistical factors which maximize the amount of return variation explained. 

* $\boldsymbol{\tilde{r}}$ denotes an $n\times 1$ random vector of excess returns with covariance matrix $\boldsymbol{\Sigma}$. 
* The first **principal component** of returns, $x^1_t =\textbf{q}_1'\boldsymbol{\tilde{r}}_t$, is characterized by a vector of excess return loadings, $\textbf{q}_1$, which solves
$$ 
\max_{\textbf{q}} ~ \textbf{q}'\boldsymbol{\Sigma}\textbf{q} \\
\text{s.t.} ~ \textbf{q}'\textbf{q} = 1 
$$
* Thus, $x^1_t = \textbf{q}_1'\boldsymbol{\tilde{r}}_t$ is the portfolio return with maximum variance.

## General definition of principal components

The $i$th principal component, $x^i_t = \textbf{q}_i'\boldsymbol{\tilde{r}}_t$, has a loading vector, $\textbf{q}_i$, that solves the same problem as above, but with the additional constraint that it be uncorrelated with the previous $i-1$ principal components:
$$ 
\max_{\textbf{q}} ~ \textbf{q}'\boldsymbol{\Sigma}\textbf{q}\\[5pt]
\text{s.t.} ~ \textbf{q}'\textbf{q}_j = \begin{cases} 1 & i=j\\ 0 & i\ne j\end{cases} 
$$

## Eigenvector decomposition

The covariance matrix of returns has the following eigenvector decomposition:
$$ \boldsymbol{\Sigma} = \textbf{Q}\Psi \textbf{Q}' $$

* $\textbf{Q}$ is an $n\times n$ matrix where each column is an eigenvector, $\textbf{q}_i$. 
* $\Psi$ is an $n\times n$ diagonal matrix of eigenvalues, $\psi_i$.
* The eigenvectors are orthonormal: $\textbf{Q}'\textbf{Q} = \mathcal{I}$.

## Eigenvectors as principal components

It turns out that the solution to the principal components problem is given by the eigenvectors of $\Sigma$. 

* The variance of principal component $i$ is
$$ \text{var}\left[x^i\right] = \textbf{q}_i'\boldsymbol{\Sigma}\textbf{q}_i = \psi_i $$
* The first principal component has maximum variance, so its weight vector is the eigenvector associated with the largest eigenvalue.
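These properties can be checked numerically. The sketch below uses simulated returns with one common shock (an assumption made purely for illustration) and `numpy.linalg.eigh`, which returns the eigenvalues of a symmetric matrix in ascending order with the eigenvectors as columns.

```python
import numpy as np

rng = np.random.default_rng(7)
T, n = 2000, 6
common = rng.normal(0, 0.04, size=(T, 1))   # one simulated common shock
r = common * rng.uniform(0.5, 1.5, n) + rng.normal(0, 0.02, size=(T, n))

sigma = np.cov(r, rowvar=False)
eigvals, Q = np.linalg.eigh(sigma)          # ascending eigenvalues, eigenvectors in columns
order = np.argsort(eigvals)[::-1]           # re-sort so the first PC has the largest variance
eigvals, Q = eigvals[order], Q[:, order]

x = (r - r.mean(axis=0)) @ Q                # principal-component series, one column per PC
pc_var = x.var(axis=0, ddof=1)              # sample variance of each component
```

Each entry of `pc_var` matches the corresponding eigenvalue, and the first column of `Q` gives the loadings of the maximum-variance portfolio.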

## Factor model of principal components

Not only do we have the principal component factors as linear combinations of the returns,
$$ \textbf{x}_t = \textbf{Q}'\tilde{r}_t $$
But we can multiply both sides by $\textbf{Q}$ to find that returns can be decomposed into a linear combination of the principal components:
$$ \tilde{r}_t = \textbf{q}_1 x^1_t + \textbf{q}_2 x^2_t + \ldots + \textbf{q}_n x^n_t $$

Of course, using $n$ factors to describe returns on $n$ assets is not useful.

## Reduction in factors

* The point of principal component models is to use a much smaller subset of the principal components to explain most of the variation. 
* For instance, one might use just three principal components in order to describe the variation of 20 or 50 different return series.

## Selecting the PC model

Consider that the percent of the variance of returns explained by principal component $i$ is
$$ \frac{\psi_i}{\sum_{j=1}^n \psi_j} $$

The percent of total variation explained by the first $k$ PC factors is then
$$ \frac{\sum_{j=1}^k \psi_j}{\sum_{j=1}^n \psi_j} $$
If a subset of $k$ can explain most of the variation, this may be a good factor decomposition for the return variation.
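One way to pick $k$ is to take the smallest number of components whose cumulative share of variance reaches a target. In this sketch the 90% target, the helper name `n_components_for`, and the eigenvalues are all hypothetical.

```python
import numpy as np

def n_components_for(eigvals, target=0.90):
    """Smallest k whose leading eigenvalues explain at least `target` of total variance."""
    ordered = np.sort(eigvals)[::-1]                  # largest eigenvalue first
    cumulative = np.cumsum(ordered) / ordered.sum()   # cumulative share of variance
    k = int(np.searchsorted(cumulative, target)) + 1
    return min(k, len(ordered)), cumulative

eigvals = np.array([0.50, 0.25, 0.10, 0.08, 0.04, 0.03])   # hypothetical eigenvalues
k, cumulative = n_components_for(eigvals, target=0.90)
```

With these made-up eigenvalues the first three components explain 85% of the variance and the first four explain 93%, so `k` is 4.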